Exploration parameter input
run_param.json
Settings for training (network structure, optimizer), exploration (LAMMPS settings, sampling strategy), and labeling (VASP/PWmat self-consistent calculation settings).
Parameter List
reserve_work
Specify whether to keep the temporary working directory. The default value is false, in which case the temporary working directory is automatically deleted after each active learning iteration.
reserve_md_traj
Specify whether to keep the MD trajectory files. The default value is false, in which case the MD trajectory files are automatically deleted after each active learning iteration.
reserve_scf_files
Specify whether to keep all result files from self-consistent calculations. The default value is false. If set to false, only four files (REPORT, etot.input, OUT.MLMD, atom.config) are kept after each active learning iteration for PWmat self-consistent calculations, and only three files (OUTCAR, POSCAR, INCAR) are kept for VASP.
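The retention rule described above can be sketched in Python. This is only an illustration of the documented behaviour, not the tool's actual implementation: the function name select_files_to_keep, the dft_style argument, and the "pwmat"/"vasp" labels are hypothetical names introduced for this example.

```python
# Files kept when reserve_scf_files is false, as listed in this section.
# The keys "pwmat" and "vasp" are labels assumed for this sketch only.
KEPT_FILES = {
    "pwmat": {"REPORT", "etot.input", "OUT.MLMD", "atom.config"},
    "vasp": {"OUTCAR", "POSCAR", "INCAR"},
}


def select_files_to_keep(file_names, dft_style, reserve_scf_files=False):
    """Return the file names that survive cleanup after one iteration.

    Hypothetical helper: it only mirrors the rule stated in the text.
    """
    if reserve_scf_files:
        # Everything produced by the self-consistent calculation is kept.
        return list(file_names)
    keep = KEPT_FILES[dft_style]
    return [name for name in file_names if name in keep]
```

For example, calling select_files_to_keep(["OUTCAR", "POSCAR", "INCAR", "WAVECAR"], "vasp") drops WAVECAR and keeps the other three files.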
init_data
Specify the directories of the initial training set as a list. Each entry can be an absolute path or a path relative to the current directory.
train
Parameters for model training, specifying the network structure and optimizer. For detailed parameter settings, refer to the PWMLFF documentation. You can either set all the training parameters as shown in the example, or keep them in a separate JSON file and give its path in the train_input_file parameter.
train_input_file
Optional parameter. If you have a separate PWMLFF input file, use this parameter to specify its path. Otherwise, set the parameters as shown in the example below. Detailed explanations of the parameters can be found in the PWMLFF parameter list.
"train": {
    "model_type": "DP",
    "atom_type": [
        14
    ],
    "max_neigh_num": 100,
    "seed": 2023,
    "data_shuffle": true,
    "train_valid_ratio": 0.8,
    "recover_train": true,
    "model": {
        "descriptor": {
            "Rmax": 6.0,
            "Rmin": 0.5,
            "M2": 16,
            "network_size": [25, 25, 25]
        },
        "fitting_net": {
            "network_size": [50, 50, 50, 1]
        }
    },
    "optimizer": {
        "optimizer": "LKF",
        "epochs": 30,
        "batch_size": 4,
        "print_freq": 10,
        "block_size": 5120,
        "kalman_lambda": 0.98,
        "kalman_nue": 0.9987,
        "train_energy": true,
        "train_force": true,
        "train_ei": false,
        "train_virial": false,
        "train_egroup": false,
        "pre_fac_force": 2.0,
        "pre_fac_etot": 1.0,
        "pre_fac_ei": 1.0,
        "pre_fac_virial": 1.0,
        "pre_fac_egroup": 0.1
    }
}
Since the default parameters in PWMLFF already cover most training requirements, you can simplify the configuration as shown below, which uses the standard DP model trained with the LKF optimizer.
"train": {
    "model_type": "DP",
    "atom_type": [14],
    "max_neigh_num": 100
}
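The choice between inline training settings and a separate input file can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual loading code: the function name resolve_train_settings is hypothetical, and the sketch assumes that train_input_file is placed inside the "train" section and takes precedence over inline settings when present.

```python
import json


def resolve_train_settings(run_param):
    """Return the training settings from a parsed run_param.json dictionary.

    Assumptions of this sketch (not confirmed by the documentation):
    train_input_file lives inside the "train" section, and when it is
    given, the settings are read from that file instead of inline keys.
    """
    train = run_param.get("train", {})
    input_file = train.get("train_input_file")
    if input_file is not None:
        # Separate PWMLFF input file: read the settings from disk.
        with open(input_file, "r", encoding="utf-8") as handle:
            return json.load(handle)
    # No separate file: the inline settings are used as they are.
    return train
```

With the simplified example above, the function returns the three inline keys unchanged, and PWMLFF defaults would cover everything else.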
strategy
Used to set the uncertainty measurement method for active learning and whether to use model compression for acceleration.
uncertainty
Set the uncertainty measurement strategy.
The default value is committee, which calculates the model prediction deviation using a committee of multiple models. This value is used together with the model_num, lower_model_deiv_f, and upper_model_deiv_f parameters. A structure is selected as a candidate if its model prediction deviation falls between lower_model_deiv_f and upper_model_deiv_f, and it is then labeled with DFT. The model_num parameter sets the number of models in the committee, with a default value of 4.
If set to kpu, uncertainty is measured with a single model using the Kalman filter. This value is used together with the kpu_lower and kpu_upper parameters, which set the lower and upper bounds of the uncertainty, respectively.
lower_model_deiv_f
This parameter is used with "uncertainty":"committee" and sets the lower bound of the deviation. If the deviation is smaller than this lower bound, the model's prediction for the configuration is considered accurate and the configuration does not need to be labeled. The default value is 0.05.
upper_model_deiv_f
This parameter is used with "uncertainty":"committee" and sets the upper bound of the deviation. If the deviation is greater than this upper bound, the configuration itself is considered unphysical and is not labeled. The default value is 0.15.
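The two deviation bounds split explored configurations into three groups. The sketch below illustrates that rule in Python; the function name classify_by_deviation and the group names "accurate", "candidate", and "failed" are hypothetical, and treating a deviation exactly equal to a bound as a candidate is an assumption of this example.

```python
def classify_by_deviation(deviation, lower=0.05, upper=0.15):
    """Sort one configuration into a group by its model deviation.

    The defaults are the documented defaults of lower_model_deiv_f and
    upper_model_deiv_f. Group names are hypothetical labels.
    """
    if deviation < lower:
        # Prediction is considered accurate: no labeling needed.
        return "accurate"
    if deviation > upper:
        # Configuration is considered unphysical: not labeled.
        return "failed"
    # Between the bounds (inclusive here, by assumption): label with DFT.
    return "candidate"
```

Only configurations in the "candidate" group would be passed on to the self-consistent calculation for labeling.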
model_num
This parameter sets the number of models used when the committee method is used for uncertainty measurement. The default value is 4.
kpu_lower
This parameter is used with "uncertainty":"kpu" and sets the lower bound of KPU (Kalman Predictive Uncertainty). If the KPU value is smaller than this lower bound, the model's prediction for the configuration is considered accurate and the configuration does not need to be labeled. The default value is 5.
kpu_upper
This parameter is used with "uncertainty":"kpu" and sets the upper bound of KPU (Kalman Predictive Uncertainty). If the KPU value is greater than this upper bound, the configuration itself is considered unphysical and is not labeled. The default value is 10.
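The defaults listed above for the two uncertainty methods can be collected in a short Python sketch. It is an illustration only: the function name fill_strategy_defaults is hypothetical, and the check that the lower bound is below the upper bound is a sanity check assumed for this example, not documented behaviour.

```python
# Defaults quoted in this section for each uncertainty method.
COMMITTEE_DEFAULTS = {
    "model_num": 4,
    "lower_model_deiv_f": 0.05,
    "upper_model_deiv_f": 0.15,
}
KPU_DEFAULTS = {"kpu_lower": 5, "kpu_upper": 10}


def fill_strategy_defaults(strategy):
    """Return a copy of the strategy settings with documented defaults added."""
    method = strategy.get("uncertainty", "committee")
    if method == "committee":
        defaults = COMMITTEE_DEFAULTS
        bounds = ("lower_model_deiv_f", "upper_model_deiv_f")
    elif method == "kpu":
        defaults = KPU_DEFAULTS
        bounds = ("kpu_lower", "kpu_upper")
    else:
        raise ValueError(f"unknown uncertainty method: {method!r}")
    # User-supplied values override the defaults.
    filled = {"uncertainty": method, **defaults, **strategy}
    # Assumed sanity check: the selection window must not be empty.
    if filled[bounds[0]] >= filled[bounds[1]]:
        raise ValueError(f"{bounds[0]} must be smaller than {bounds[1]}")
    return filled
```

For instance, fill_strategy_defaults({"uncertainty": "kpu", "kpu_upper": 8}) keeps the user's upper bound and adds the default kpu_lower of 5.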
max_select
This parameter sets the maximum number of configurations selected for labeling in each round of active learning, for each initial exploration structure that does not have the select_sys parameter set. If the number of structures to be labeled exceeds this value, max_select structures are randomly chosen from them. By default this parameter is not set, meaning there is no limit.
For example, in the following MD exploration setting, select_sys is not set, so if max_select is set, a maximum of max_select configurations will be collected for each of the two structures specified in sys_idx. Therefore, for this MD exploration setting, a maximum of 2 × max_select structures will be collected for labeling.
{
    "ensemble": "nvt",
    "nsteps": 1000,
    "md_dt": 0.002,
    "trj_freq": 10,
    "sys_idx": [0, 1],
    "temps": [500, 800],
    "taut": 0.1,
    "press": [1.0],
    "taup": 0.5,
    "boundary": true
}
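The effect of max_select on one exploration structure can be sketched with random sampling from Python's standard library. This is a hedged illustration, not the tool's actual code: the function name limit_selection and the seed argument are hypothetical additions for reproducibility in this example.

```python
import random


def limit_selection(candidates, max_select=None, seed=None):
    """Return at most max_select randomly chosen candidates.

    Hypothetical helper. max_select=None mirrors the documented default
    of "not set", meaning no limit is applied.
    """
    candidates = list(candidates)
    if max_select is None or len(candidates) <= max_select:
        return candidates
    # Draw max_select distinct candidates without replacement.
    rng = random.Random(seed)
    return rng.sample(candidates, max_select)
```

In the MD setting above, such a limit would be applied separately to the candidates coming from each structure listed in sys_idx.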
compress
This parameter specifies whether to compress the model. The compressed model has slightly lower accuracy but gives faster predictions. The default value is false.
compress_order
This parameter sets the compression method for the model. The default value is "compress_order":3, which corresponds to third-order polynomial compression. It can also be set to "compress_order":5, which corresponds to fifth-order polynomial compression. Fifth-order compression is more accurate but slightly slower than third-order compression.
Compress_dx
This parameter sets the grid size for model compression, with a default value of "Compress_dx":0.01.